Abstract — This article introduces a simple and effective 
methodology to determine the level of congestion in a network
with an ECN-like marking scheme. The purpose of the ECN bit
is to notify TCP sources of imminent congestion so that they
can react before losses occur. However, ECN is a binary indicator
which does not reflect the congestion level (i.e. the percentage of
queued packets) of the bottleneck, and thus prevents any
proportionate reaction. In this study, we use a counter in place of
the traditional ECN marking scheme to record the number of times a
packet has crossed a congested router. With this simple counter, we
conduct a statistical analysis to accurately estimate the congestion
level of each router on a network path. We detail an analytical
method, validated by preliminary simulations which demonstrate the
feasibility and accuracy of the proposed concept. We conclude with
possible applications and expected future work.

Index Terms — Congestion estimation, ECN, measurements 

I. Introduction 

While dropping packets to prevent congestion was once considered
a paradox, many studies have shown the undeniable benefits of the
Explicit Congestion Notification (ECN) flag [11]. The story starts
in 1994, when Sally Floyd showed that this notification improves
TCP performance [2]; the authors of [7] later reached a similar
conclusion for web traffic. Finally, Aleksandar Kuzmanovic, in "The
Power of Explicit Congestion Notification" [6], investigates the
relevance of ECN and demonstrates once again that ECN users obtain
better performance even if the Internet is not fully ECN-capable.

A study [8] published in 2004 reports that ECN is used by only
2.1% of computers, and that this low percentage can be partly
explained by firewalls, NATs and other middleboxes of the Internet
which reset the ECN flag without any justification. However, this
is definitely not the main reason. Indeed, although this flag is
currently implemented both in end-hosts (GNU/Linux, Mac OS X and
Windows Vista) and inside the core network (Cisco IOS implements a
RED/ECN variant called WRED/ECN), ECN remains surprisingly
disabled by default on all these systems. Concerning end-hosts,
this might appear paradoxical: today the CUBIC and Compound TCP
variants are enabled by default (in GNU/Linux and Windows Vista
respectively) although their friendliness with the current TCP
NewReno version is still under debate, while a proven mechanism
such as ECN is not.



We believe this trend has two main reasons. Firstly, it is partly
due to the behaviour of TCP when facing ECN-marked packets. Indeed,
the goal of the ECN bit is to notify TCP sources of imminent
congestion, but this binary indicator does not reflect the real
network congestion level. Intuitively, the CUBIC and Westwood
protocols might perform better than TCP NewReno/ECN because the
binary ECN signal does not provide any quantitative estimation of
the congestion level that would allow TCP to adapt its sending rate
efficiently¹. In other words, whatever the number of ECN marks
received, the TCP reaction is to halve the congestion window, and
this action is not well adapted to all cases. Secondly, CUBIC and
Westwood are pure end-to-end solutions and, as a result, are much
easier to deploy, while TCP/ECN must involve both the core network
and the end-hosts. However, several research works demonstrate that
a mechanism which optimally manages network congestion and capacity
while being fair to other flows cannot be designed without network
collaboration [3], [5], [10]. Unfortunately, and to the best of our
knowledge, the major barrier is that there is today no solution
able to assess, at the sender side, the exact congestion level of
the bottleneck of the path without involving complex computation
inside the core routers (as BMCC [10] or XCP [5] do). Such an
assessment would allow a transport protocol such as TCP to react
correctly to this congestion. For instance, BMCC introduces complex
mechanisms inside the router and is only compliant with IPv4 (due
to its use of the 16-bit IP ID field of the IPv4 header), while XCP
involves large architectural changes.

This fact motivates the present study, which proposes a
statistical algorithm to assess the congestion level at the
end-hosts (i.e. receiver or sender side) without involving complex
computation inside the core network. In particular, we aim at
providing a practical solution that returns concrete congestion
measurements to the sender in order to avoid blind, approximate or
excessive reactions from the source. The only modification concerns
the marking method, which is changed from a binary field to a count
field similar to the TTL field of the IP packet. In practice, we do
not have to extend the IP header, as the DiffServ Codepoint field
is large enough to carry our proposal. One could argue, as in [10],
about whether such a modification requires a heavy IETF
standardization process; however, we claim that it would be much
more complex and uncertain to convince networking companies to add
complex estimation methods inside their own routers. Furthermore,
this solution is generic enough for the field to be treated, as
with ECN, either as a simple binary indicator or as a counter.
Finally, we point out that a recent IETF group named ConEx
(Congestion Exposure) [9] attempts to enable congestion to be
exposed within the network layer of the Internet. The main
candidate solution to date is re-ECN [1], which proposes the use of
a second bit inside the IP header in order to differentiate the
congestion upstream and downstream of an observation point inside
the network. Internet service providers are pushing this idea, as
it would provide an essential tool (currently missing) to better
manage and control their traffic². If this solution is adopted, we
could witness a larger deployment of the ECN field, which would
facilitate the deployment of our proposal.

¹ We remark that there is a lack of performance evaluation studies
comparing ECN-compliant protocols with new proposals such as CUBIC.
At least, a recent study clearly shows a disequilibrium between TCP
NewReno and CUBIC [12].

Following this new marking scheme, we propose a simple method
which permits an accurate estimation of the congestion level
experienced inside the routers of a given path. We first present
the mathematical basis of our proposal; we then describe our
simulations and the practical analysis used to evaluate the
congestion level. Finally, we conclude on the possibilities offered
by this solution and detail the remaining work.

II. Marking proposal 

The ECN bit, as defined in RFC 3168, is a binary field of the IP
header. This field can only contain a boolean value which informs a
sender whether a packet has crossed at least one congested router.
Thus, it is impossible to distinguish a packet marked once from one
marked several times because it crossed several congested routers.
This prevents any accurate measurement of the observed path for the
sake, for instance, of a proportionate reaction from the source. In
fact, an ECN-capable packet crossing a path of two routers marking
at 30% and 40% respectively is marked with a probability of 58%
(i.e. 1 - (1 - 0.4)(1 - 0.3)). Obviously, this does not reflect the
level of congestion of the network bottleneck (in this example:
40%) and could lead to an excessive reaction from the source. Thus,
we propose to enrich the information returned with an incremental
field (denoted ECN*) which counts how many times a packet is
marked. The marking scheme, as for RED/ECN, strictly follows the
RED algorithm [4]. We will use this new metric (i.e. how many times
a packet is marked) to determine the level of congestion of the
bottleneck. A RED/ECN* router increments this counter instead of
simply setting the ECN field to one. Through the analysis of the
data received, a source can build the distribution of the marked
packets. Obviously, we cannot use this metric as it stands. In the
following, we present the analytical method used to interpret the
data collected.
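The 30%/40% example above can be checked numerically. The following is a minimal Python sketch, not part of the original study: the function names are illustrative, and routers are assumed to mark packets independently of each other. It contrasts what a binary ECN flag exposes with the distribution an ECN* counter would expose.

```python
from itertools import product


def binary_mark_probability(rates):
    """Probability that a packet is marked at least once (binary ECN view)."""
    unmarked = 1.0
    for p in rates:
        unmarked *= (1.0 - p)
    return 1.0 - unmarked


def counter_distribution(rates):
    """Probability of each ECN* counter value 0..n, by enumerating outcomes.

    Assumes independent marking decisions at each router.
    """
    dist = [0.0] * (len(rates) + 1)
    for outcome in product((0, 1), repeat=len(rates)):
        prob = 1.0
        for marked, p in zip(outcome, rates):
            prob *= p if marked else (1.0 - p)
        dist[sum(outcome)] += prob
    return dist


rates = [0.3, 0.4]
print(binary_mark_probability(rates))  # about 0.58
print(counter_distribution(rates))     # about [0.42, 0.46, 0.12]
```

The counter distribution carries two independent values (the fractions of packets marked once and twice), which is what makes it possible to recover two marking rates instead of a single aggregate.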

III. Analytical Study 

We present in this part the statistical analysis that allows us
to process the data collected with our marking proposal. The
results obtained establish a relationship between the frequency of
ECN* marked packets and the queue size of the routers of the path.

² See the IETF [re-ecn] mailing list and [9] for further details.

A. Hypothesis and notations 

We consider a topology of $n$ core routers in a row. For
$1 \le i \le n$, we denote by $R_i$ the router number $i$. All
these $n$ routers adopt the ECN* marking scheme presented above.
Each router drops packets only if its queue is full. We consider
that the congestion inside the network is stable. This relative
network stability induces a constant congestion level and, as a
result, a constant average queue size for each router. Moreover,
since a router decides to mark a packet only by analyzing its
average queue size, we expect to obtain a constant marking
probability for each router. We call this probability the "marking
rate". In the rest of this paper, we adopt the following notations:

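• $n$: number of congested routers;

• $p_i$: marking rate of the $i$-th router on a path of $n$ routers;

• $M_k^n$: the event "a packet is marked $k$ times" on a path of $n$ routers;

• $p(M_k^n)$: the probability of the event $M_k^n$;

• $\sigma_k^n$: the $k$-th elementary symmetric polynomial in the $n$
variables $p_1, \dots, p_n$. We recall:
$\sigma_k^n = \sum_{1 \le j_1 < \cdots < j_k \le n} p_{j_1} \cdots p_{j_k}$.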
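To make the last notation concrete, here is a small Python sketch (illustrative only; the function name is an assumption, not from the paper) that computes all elementary symmetric polynomials of a list of marking rates with a simple dynamic-programming update, taking $\sigma_0^n = 1$ by convention.

```python
def elementary_symmetric(values):
    """Return [sigma_0, sigma_1, ..., sigma_n] for the given values.

    sigma_k is the sum of all products of k distinct values; sigma_0 = 1.
    """
    sigma = [1.0] + [0.0] * len(values)
    for p in values:
        # Go from high k to low k so that p contributes at most once
        # to every product.
        for k in range(len(values), 0, -1):
            sigma[k] += p * sigma[k - 1]
    return sigma


print(elementary_symmetric([0.3, 0.4]))  # about [1.0, 0.7, 0.12]
```

For two routers marking at 30% and 40%, this gives $\sigma_1^2 = 0.7$ (the sum of the rates) and $\sigma_2^2 = 0.12$ (their product).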

B. A first simple example : case of two routers 

Let us assume a topology of two congested core routers $R_1$ and
$R_2$ ($n = 2$). In this example, we want to determine the marking
rate of both routers from data collected by the sender positioned
before $R_1$. In the same way as standard ECN, which uses an ECN
echo, the value of the ECN* counter is sent back to the sender with
the TCP acknowledgement. Following the previous notations, we call
$p_1$ and $p_2$ the marking rates of $R_1$ and $R_2$ respectively.
A simple calculation shows that a connection observes a marked
packet with a probability of $1 - (1 - p_1)(1 - p_2)$. Thus, with a
standard ECN field, the sender cannot differentiate the two marking
rates and so perceives a global congestion level which is higher
than, and not representative of, the real congestion state. With
our ECN* proposal, we refine the information sent back to the
sender by determining the marking rate of each crossed router. The
sender can thus determine the level of the bottleneck queue and
react in a manner better suited to the congestion state. In this
example, we can estimate not only the ratio of marked packets but
also the ratios of packets marked once and twice. The sender can
now estimate $p(M_1^2)$ and $p(M_2^2)$. These values become the new
inputs of the problem. Developing these probabilities, we have:

$$p(M_1^2) = p_1(1 - p_2) + p_2(1 - p_1) = \sigma_1^2 - 2\sigma_2^2$$
$$p(M_2^2) = p_1 p_2 = \sigma_2^2$$

which is equivalent to:

$$p(M_1^2) = \binom{1}{0}\sigma_1^2 - \binom{2}{1}\sigma_2^2$$
$$p(M_2^2) = \binom{2}{0}\sigma_2^2$$
Thanks to these equations, the sender can easily determine
$\sigma_1^2$ and $\sigma_2^2$. Thus, using the relationship between
the coefficients of a polynomial and the elementary symmetric
functions of its roots, the sender can evaluate $p_1$ and $p_2$
(here, $p_1$ and $p_2$ are the roots of the polynomial
$P(x) = x^2 - \sigma_1^2 x + \sigma_2^2$). We detail in the
following part how to compute the polynomial in a more general way
in order to find the different $p_k$. Of course, the sender cannot
associate each marking rate with the corresponding router, but it
gets a correct estimation of the congestion level of the
bottleneck.
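As an illustration of the two-router case, the sketch below (hypothetical helper name, not from the paper) recovers $p_1$ and $p_2$ from the observed fractions of packets marked once and twice by solving the quadratic $x^2 - \sigma_1^2 x + \sigma_2^2 = 0$. It assumes the measured frequencies are accurate enough for the discriminant to be non-negative.

```python
import math


def two_router_rates(freq_once, freq_twice):
    """Recover the two marking rates from p(M_1^2) and p(M_2^2).

    sigma_2 = p(M_2^2) and sigma_1 = p(M_1^2) + 2 * sigma_2; the rates are
    the roots of x^2 - sigma_1 * x + sigma_2.
    """
    sigma2 = freq_twice
    sigma1 = freq_once + 2.0 * sigma2
    disc = sigma1 * sigma1 - 4.0 * sigma2
    if disc < 0:
        # Can happen with noisy measurements; no real roots in that case.
        raise ValueError("no real marking rates for these frequencies")
    root = math.sqrt(disc)
    return (sigma1 - root) / 2.0, (sigma1 + root) / 2.0


# Routers marking at 30% and 40% give p(M_1^2) = 0.46 and p(M_2^2) = 0.12.
print(two_router_rates(0.46, 0.12))  # close to (0.3, 0.4)
```

The larger of the two roots is the bottleneck marking rate, even though the sender cannot tell which router it belongs to.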

We develop this case because it constitutes the basis of the
proof by mathematical induction of the general formula for
$p(M_k^n)$. Indeed, once the distribution of the marked packets is
known, the crucial step is the deduction of the $\sigma_k^n$. To do
this, we use the formula for $p(M_k^n)$ and solve a simple linear
system. Then, as shown in the following part, the roots of the
polynomial give us the different $p_i$.

The general formula has the following form (it is demonstrated in
the Appendix):

$$\forall k, 1 \le k \le n, \quad p(M_k^n) = \sum_{i=0}^{n-k} (-1)^i \binom{k+i}{i} \sigma_{k+i}^n \quad (1)$$
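Formula (1) can be sanity-checked numerically. The sketch below is an illustration under the assumption of independent marking at each router; the function names and the sample rates are chosen for the example. It evaluates (1) directly and compares it with a brute-force enumeration of all marking outcomes.

```python
from itertools import combinations, product
from math import comb, prod


def sigma(rates, k):
    """k-th elementary symmetric polynomial of the marking rates."""
    return sum(prod(c) for c in combinations(rates, k))


def p_marked_formula(rates, k):
    """p(M_k^n) computed with formula (1)."""
    n = len(rates)
    return sum((-1) ** i * comb(k + i, i) * sigma(rates, k + i)
               for i in range(n - k + 1))


def p_marked_enumeration(rates, k):
    """p(M_k^n) computed by summing over all 2^n marking outcomes."""
    total = 0.0
    for outcome in product((0, 1), repeat=len(rates)):
        if sum(outcome) == k:
            total += prod(p if m else 1.0 - p
                          for m, p in zip(outcome, rates))
    return total


rates = [0.11, 0.21, 0.44, 0.55]  # example values only
for k in range(1, len(rates) + 1):
    print(k, p_marked_formula(rates, k), p_marked_enumeration(rates, k))
```

Both columns should agree up to floating-point rounding for every $k$ between 1 and $n$.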

C. Resolution 

With the formula established, we now detail the operations a
sender has to perform in order to deduce the marking rates of all
the congested routers of its path. First of all, from the
distribution of the marked packets, the sender can estimate all the
$p(M_k^n)$. Indeed, the value of $p(M_k^n)$ is simply the ratio
between the number of packets marked $k$ times and the total number
of packets received by the sender. Moreover, using (1), the sender
can compute the $\sigma_k^n$. Indeed, developing these relations,
we obtain:

$$p(M_1^n) = \sigma_1^n - 2\sigma_2^n + \cdots + (-1)^{n-1} \binom{n}{n-1} \sigma_n^n$$
$$p(M_2^n) = \sigma_2^n - 3\sigma_3^n + \cdots + (-1)^{n-2} \binom{n}{n-2} \sigma_n^n$$
$$\vdots$$
$$p(M_n^n) = \sigma_n^n$$

The unknowns are the $\sigma_k^n$: we obtain a triangular system
of $n$ equations with $n$ unknowns, which is solved directly by
back substitution starting from $\sigma_n^n$.

1) The Solving Polynomial: Once the previous system is solved,
all the $\sigma_k^n$ are known. We now have to deduce the $p_i$.
As said previously, the $\sigma_k^n$ are elementary symmetric
functions. Thus, using the relationship between a polynomial and
the elementary symmetric functions of its roots, we can deduce the
$p_i$. We detail this step in the following. Let $P(x)$ be a
polynomial of degree $n$, which we write as follows:

$$P(x) = \sum_{m=0}^{n} a_m x^m \quad (2)$$

Let $p_i$, $1 \le i \le n$, be the $n$ roots of $P$. Thus:

$$\forall k, 1 \le k \le n, \quad \sigma_k^n = \sum_{1 \le j_1 < \cdots < j_k \le n} p_{j_1} \cdots p_{j_k}$$

Moreover, we have the following relationships:

$$\forall k, 1 \le k \le n, \quad \sigma_k^n = (-1)^k \frac{a_{n-k}}{a_n} \quad (3)$$

We set $a_n = 1$ in (3). Then, with the convention
$\sigma_0^n = 1$, (2) becomes:

$$P(x) = \sum_{k=0}^{n} (-1)^{n-k} \sigma_{n-k}^n x^k$$

So, we have a polynomial of degree $n$ whose roots correspond to
the $n$ marking rates of the $n$ congested routers crossed on the
path. We now just need to estimate these roots.

IV. Simulation

In this section, we evaluate our algorithm with data obtained
from an ns-2 simulation. This section is divided into three parts.
First, we present the topology used in the ns-2 simulation and the
data gathered. Then, we present the construction of the solving
polynomial and a subtlety in its resolution. Finally, we present
our results, a comparison with the expected results and a brief
discussion of these last two points.



A. Tests Topology and gathering of data 

The topology used for the tests is given in Figure 1. We use
TCP/NewReno flows, and the reaction of the senders to ECN is
disabled: they do not decrease their congestion window when they
receive an ECN-marked acknowledgement. We have implemented our
ECN* field, and all the RED/ECN* routers use the same parameters:
$min_{th} = 50$, $max_{th} = 100$, $max_p = 1$, with a queue length
of 100. Concerning the aggregate of disturbing flows, an accurate
tuning of the senders' emission window has been necessary to
simulate a distributed congestion. The analysis of the data is done
after 10 minutes, when we consider the network stable (this
corresponds to a generation of 50000 packets). We analyze the two
following TCP flows: flow #1 from SRC1 to RCV1 and flow #2 from
SRC2 to RCV2. The topology deliberately includes two routers shared
by both flows, in order to estimate the impact of cross traffic on
our algorithm.

The statistical study consists in building the histogram of the
distribution of the values of the ECN* marking field for flows #1
and #2. These results are presented in Figures 2(a) and 2(b).
Fig. 1. Topology used for the simulation 




(a) Results for flow #1 (b) Results for flow #2 

Fig. 2. Distribution of ECN* marked packets

B. Determination of solving polynomial for flow #1 
Figure 2(a) gives the following results (here $n = 4$):

$$p(M_1^4) = 0.4264 = \sigma_1^4 - 2\sigma_2^4 + 3\sigma_3^4 - 4\sigma_4^4$$
$$p(M_2^4) = 0.3134 = \sigma_2^4 - 3\sigma_3^4 + 6\sigma_4^4$$
$$p(M_3^4) = 0.0738 = \sigma_3^4 - 4\sigma_4^4$$
$$p(M_4^4) = 0.00548 = \sigma_4^4$$

We then deduce the following $\sigma_k^4$:

$$\sigma_1^4 = 1.297$$
$$\sigma_2^4 = 0.5676$$
$$\sigma_3^4 = 0.0957$$
$$\sigma_4^4 = 0.00548$$

By applying the method previously described, we have:

$$P(x) = x^4 - 1.297x^3 + 0.5676x^2 - 0.0957x + 0.00548$$
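The back substitution that leads from the measured frequencies to the $\sigma_k^4$, and then to the coefficients of $P(x)$, can be sketched as follows. This is an illustrative Python sketch: the function names are assumptions, and only the four frequencies are taken from the text above.

```python
from math import comb


def sigmas_from_frequencies(freqs):
    """Solve the triangular system given by formula (1).

    freqs[k - 1] is the fraction of packets marked k times (k = 1..n).
    Returns [sigma_1, ..., sigma_n].
    """
    n = len(freqs)
    sigma = [0.0] * (n + 1)  # index 0 is unused
    for k in range(n, 0, -1):
        # Terms of (1) involving sigmas that are already known.
        known = sum((-1) ** i * comb(k + i, i) * sigma[k + i]
                    for i in range(1, n - k + 1))
        sigma[k] = freqs[k - 1] - known
    return sigma[1:]


def polynomial_coefficients(sigmas):
    """Coefficients of P(x), highest degree first, with leading term 1."""
    return [1.0] + [(-1) ** k * s for k, s in enumerate(sigmas, start=1)]


freqs = [0.4264, 0.3134, 0.0738, 0.00548]  # flow #1 values from the text
sigmas = sigmas_from_frequencies(freqs)
print(sigmas)                        # close to 1.297, 0.5676, 0.0957, 0.00548
print(polynomial_coefficients(sigmas))
```

Starting from $\sigma_4^4$ and working upwards, each step only needs the values computed before it, which is why the system can be solved without any matrix inversion.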

C. Practical Resolution 

Once the solving polynomial is built, we have to solve
$P(x) = 0$. The four roots of $P(x)$ correspond to the four marking
rates of the four congested routers crossed by packets arriving at
RCV1. As this problem is a stochastic one, we have to account for
an uncertainty in the measurements obtained from the simulations.
Indeed, unless we have an infinite number of packets, the measured
frequencies drift from their true values. We take this possible
drift into account when determining the roots of $P(x)$. Basically,
we solve $P(x) = \epsilon$ for
$-10^{-3} \le \epsilon \le 10^{-3}$. Thus, we obtain four "areas of
roots" instead of four exact roots, and we take the midpoint of
each area as the estimate. We denote by $\epsilon_{min}$ and
$\epsilon_{max}$ the extreme values of $\epsilon$ for which
$P(x) = \epsilon$ has four solutions. Indeed, since we observe
packets marked four times, we have to find four solutions of the
equation $P(x) = \epsilon$. This condition allows us to determine
these four areas of roots.



In our example, we obtain the four following areas of roots:
[0.075, 0.14], [0.14, 0.28], [0.34, 0.50] and [0.52, 0.57]. This
allows us to deduce the four following marking rates: 11%, 21%,
42% and 55%. With the same reasoning, we obtain for flow #2 the
four following areas of roots: [0.17, 0.23], [0.24, 0.38],
[0.40, 0.49] and [0.73, 0.74], and so the four following marking
rates: 20%, 31%, 44% and 74%. These results are presented in
Table I.
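The "areas of roots" idea can be sketched in a few lines of Python. This is a rough illustration, not the authors' implementation: the grid of $\epsilon$ values, the scan resolution and the bisection-based root finder are all assumptions made for the example. For each $\epsilon$ in the grid, it looks for the real solutions of $P(x) = \epsilon$ in $[0, 1]$, keeps only the values of $\epsilon$ that yield four solutions, and records the range swept by each root.

```python
def horner(coeffs, x):
    """Evaluate a polynomial given highest-degree-first coefficients."""
    value = 0.0
    for c in coeffs:
        value = value * x + c
    return value


def real_roots(coeffs, eps, steps=1000):
    """Solutions of P(x) = eps in [0, 1]: sign-change scan plus bisection."""
    xs = [j / steps for j in range(steps + 1)]
    vals = [horner(coeffs, x) - eps for x in xs]
    roots = []
    for j in range(steps):
        if vals[j] * vals[j + 1] < 0:
            lo, hi, f_lo = xs[j], xs[j + 1], vals[j]
            for _ in range(40):
                mid = (lo + hi) / 2.0
                f_mid = horner(coeffs, mid) - eps
                if f_lo * f_mid <= 0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append((lo + hi) / 2.0)
    return roots


def root_areas(coeffs, eps_values):
    """[min, max] swept by each root over the eps values giving n roots."""
    n = len(coeffs) - 1
    areas = None
    for eps in eps_values:
        roots = real_roots(coeffs, eps)
        if len(roots) != n:
            continue  # this eps does not give one root per router
        if areas is None:
            areas = [[r, r] for r in roots]
        else:
            for area, r in zip(areas, roots):
                area[0] = min(area[0], r)
                area[1] = max(area[1], r)
    return areas


# Polynomial of flow #1, with the rounded coefficients given in the text.
coeffs = [1.0, -1.297, 0.5676, -0.0957, 0.00548]
eps_values = [j * 1e-5 for j in range(-100, 101)]  # assumed grid
areas = root_areas(coeffs, eps_values)
midpoints = [(lo + hi) / 2.0 for lo, hi in areas]
print(areas)
print(midpoints)
```

With these assumptions, the midpoints should land in the neighbourhood of the four marking rates quoted above; the exact interval bounds depend on the $\epsilon$ grid and on the rounding of the coefficients, so they are not expected to match the reported areas digit for digit.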

D. Results interpretation 

We now compare the computed results with the average queue
length of each RED/ECN* router measured during the simulation, from
which we can deduce the real marking rate of each RED queue. These
results are grouped and presented in Table I, next to the roots
computed for flows #1 and #2 in Section IV-C. We note that the
observed average queue values have a low standard deviation: these
values are almost constant over the whole simulation.





                   Average queue    Theoretical     Estimated marking rate
                   size (# pkts)    marking rate    flow #1      flow #2

Queue 1 (E2-C1)    55.5             11%             11%          -
Queue 2 (C1-C2)    60.5             21%             21%          20%
Queue 3 (C2-C3)    72               44%             42%          44%
Queue 4 (C3-E3)    77.5             55%             55%          -
Queue 5 (E1-C1)    65.5             32%             -            31%
Queue 6 (C3-E4)    87               74%             -            74%

TABLE I

Average queue length and corresponding theoretical marking rate
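The theoretical marking rates of Table I are consistent with the linear RED marking profile applied to the average queue size. The sketch below is a simplification: it ignores the count-based spreading of marks used by full RED implementations, and the function name is illustrative. It uses the simulation parameters ($min_{th} = 50$, $max_{th} = 100$, $max_p = 1$) as defaults.

```python
def red_marking_rate(avg_queue, min_th=50.0, max_th=100.0, max_p=1.0):
    """Linear RED marking probability for a given average queue size."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Average queue sizes from Table I.
for name, avg in [("Queue 1", 55.5), ("Queue 2", 60.5), ("Queue 3", 72.0),
                  ("Queue 4", 77.5), ("Queue 5", 65.5), ("Queue 6", 87.0)]:
    print(name, round(100 * red_marking_rate(avg)), "%")
```

With these parameters, an average queue of 55.5 packets maps to 11% and one of 87 packets to 74%, as in the table. For Queue 5 the linear profile gives about 31%, one point below the 32% listed.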



These results globally correspond to the estimations, with a
slight difference explained by the size of the sample. Moreover,
if we compare the results obtained analytically with those obtained
by simulation in Table I, we notice that flows #1 and #2 both
estimate two marking rates corresponding to the two routers crossed
by both flows. Thus, not only do these results match the expected
ones, but they also underline an important aspect: these
measurements do not seem to disturb each other and appear to be
independent (several other measurements, not presented here, tend
to confirm this). In other words, several measurements can be
conducted in parallel on the same network. We also verify, thanks
to this simulation, that the hypothesis of network stability is
sufficient. Thus, if we assume that the path used is relatively
constant and that the congestion level remains stable, this method
allows a good estimation of the congestion level of the different
routers of a given path.

E. Convergence of this method 

As detailed previously, we adopt a probabilistic approach to
solve this problem. We take the measurement uncertainty into
account by solving $P(x) = \epsilon$. Nevertheless, it is necessary
to focus on the convergence time of this solution, that is, to
assess when the sample is large enough to correctly determine the
different marking rates. To do so, we evaluate the different
$\sigma_k^n$, directly linked to the coefficients of the solving
polynomial, every 50 received packets. The evolution of these
coefficients as a function of the number of received packets allows
us to determine a threshold beyond which their values no longer
evolve. A second threshold can also be set: the number of packets
from which we can find the solutions to the equation
$P(x) = \epsilon$. This approach is presented in Figure 3(a).
(a) Evaluation of sigmas as a function of the number of received packets
(b) Computed marking rates as a function of the number of received packets

Fig. 3. Evolution of sigmas and marking rates as a function of received
packets



As we can see in Figure 3(b), we only need 4000 packets to
obtain a correct estimation of the coefficients and about 8000
packets (equivalent to a 90-second transfer in our simulation) to
reach a very accurate estimation with $\epsilon$ within
$\pm 10^{-3}$. If we focus on Figure 3(a), we can note that between
3000 and 4000 received packets, the coefficients of the polynomial
no longer evolve much. This underlines the accuracy required to
establish a good solving polynomial: we have to estimate the
$p(M_k^n)$ accurately to obtain good results. Other simulations,
not presented here, done over a similar topology but with less
congested routers, have shown that these thresholds are slightly
higher. In fact, the rarer the marking event, the larger the sample
has to be in order to observe this event and so to estimate it
accurately. Conversely, the higher the marking rates (i.e. the
heavier the congestion), the smaller the sample can be.
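The convergence behaviour described above can be explored with a small Monte Carlo experiment. The sketch below is only an illustration under simplifying assumptions: independent marking with constant rates (taken here equal to the rates estimated for flow #1), a pseudo-random generator with a fixed seed, and hypothetical function names. It is not a substitute for the ns-2 simulation. It re-estimates the $\sigma_k^n$ every 50 received packets.

```python
import random
from math import comb


def simulate_counters(rates, packets, seed=1):
    """ECN* counter value of each packet, assuming independent marking."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for p in rates) for _ in range(packets)]


def running_sigmas(counters, n, every=50):
    """Yield (packets_seen, [sigma_1..sigma_n]) every `every` packets."""
    hist = [0] * (n + 1)
    for seen, c in enumerate(counters, start=1):
        hist[c] += 1
        if seen % every == 0:
            freqs = [h / seen for h in hist]
            sigma = [0.0] * (n + 1)
            for k in range(n, 0, -1):
                known = sum((-1) ** i * comb(k + i, i) * sigma[k + i]
                            for i in range(1, n - k + 1))
                sigma[k] = freqs[k] - known
            yield seen, sigma[1:]


rates = [0.11, 0.21, 0.44, 0.55]  # assumed constant marking rates
counters = simulate_counters(rates, 20000)
history = list(running_sigmas(counters, len(rates)))
for seen, sigmas in history[::80]:
    print(seen, [round(s, 4) for s in sigmas])
```

Plotting the successive estimates against the number of packets gives curves of the same kind as Figure 3(a): noisy at first, then flattening as the sample grows. Lowering the marking rates in `rates` makes the rare events (packets marked three or four times) scarcer, so the flattening occurs later.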

V. Conclusion 

In this article, we have proposed to increase the level of
congestion information returned by TCP feedback messages with an
ECN* marking scheme. ECN* enables the ECN field to count how many
times a packet has crossed a congested router. We define an
algorithm able to estimate the congestion level of each queue of a
given path through the analysis of the data collected. These
preliminary results suggest that this method is reliable and robust
to cross traffic. In this study, we demonstrate the relationship
between the ECN* marking rate and the filling level of each
router's queue.

However, several other investigations need to be conducted in
the context of dynamic networks and concerning the size of the
statistical sample. This method is not complete, and we are
currently investigating an extension of this algorithm that is
robust to dynamic changes of the network, based on a novel way to
determine the areas of roots that allows a faster convergence to
the solution. Finally, the ultimate step is obviously to determine
how to interact with TCP congestion control and how TCP should take
this new congestion feedback information into account.
Appendix

To demonstrate (1), we use a proof by mathematical induction on
the number of congested routers $n$.

Basis: the formula is demonstrated in part III-B.

Inductive step: $p(M_k^{n+1})$ is the probability for a packet to
be marked $k$ times over a path of $n + 1$ routers. The event
$M_k^{n+1}$ can be decomposed. Indeed, being marked $k$ times over
a path of $n + 1$ routers amounts either to being marked $k$ times
by the first $n$ routers and not being marked by router $n + 1$, or
to being marked $k - 1$ times by the first $n$ routers and being
marked by router $n + 1$. In terms of probability, this
decomposition can be written as follows:

$$\forall k, 1 \le k \le n, \quad p(M_k^{n+1}) = p(M_k^n)(1 - p_{n+1}) + p(M_{k-1}^n)\, p_{n+1} \quad (4)$$

Moreover, we have the following relations:

$$\forall k, 1 \le k \le n, \quad \sigma_k^{n+1} = \sigma_k^n + p_{n+1}\, \sigma_{k-1}^n, \qquad \sigma_{n+1}^{n+1} = p_{n+1}\, \sigma_n^n, \qquad \sigma_0^n = 1 \quad (5)$$

Developing (4) and using (5), we have:

$$\forall k, 1 \le k \le n, \quad p(M_k^{n+1}) = \left[ \sum_{i=0}^{n-k} (-1)^i \binom{k+i}{i} \sigma_{k+i}^n \right] (1 - p_{n+1}) + \left[ \sum_{i=0}^{n-k+1} (-1)^i \binom{i+k-1}{i} \sigma_{i+k-1}^n \right] p_{n+1}$$

The formula is thus demonstrated for $n + 1$. Then, we have:

$$\forall k, 1 \le k \le n + 1, \quad p(M_k^{n+1}) = \sum_{i=0}^{n+1-k} (-1)^i \binom{k+i}{i} \sigma_{k+i}^{n+1}$$

QED
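The inductive step can also be checked numerically. The sketch below is an illustration, not part of the proof: the function names and test values are assumptions, and formula (1) is extended to $k = 0$, where it reduces to the probability of never being marked. It verifies that the values given by (1) on $n + 1$ routers satisfy the decomposition (4) in terms of the values on the first $n$ routers.

```python
from itertools import combinations
from math import comb, prod


def p_marked(rates, k):
    """Formula (1); for k = 0 it gives the probability of no mark at all."""
    n = len(rates)

    def sigma(j):
        return sum(prod(c) for c in combinations(rates, j))

    return sum((-1) ** i * comb(k + i, i) * sigma(k + i)
               for i in range(n - k + 1))


def inductive_step_holds(rates, p_new, tol=1e-12):
    """Check decomposition (4) when a router with rate p_new is appended."""
    n = len(rates)
    extended = rates + [p_new]
    for k in range(1, n + 1):
        lhs = p_marked(extended, k)
        rhs = (p_marked(rates, k) * (1.0 - p_new)
               + p_marked(rates, k - 1) * p_new)
        if abs(lhs - rhs) > tol:
            return False
    return True


print(inductive_step_holds([0.11, 0.21, 0.44], 0.55))
```

A numerical check of this kind does not replace the induction, but it is a quick way to catch a sign or index error when transcribing the formulas.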



